
Conversation

@psocolovsky (Contributor) commented May 18, 2025

llama internally supports load cancellation, but when using the common* API it is not accessible from outside:

struct common_init_result common_init_from_params(common_params & params) {
    common_init_result iparams;
    auto mparams = common_model_params_to_llama(params); // progress_callback is always left as null here

    llama_model * model = llama_model_load_from_file(params.model.path.c_str(), mparams); // <<< ---- no way to cancel this load
    if (model == NULL) {
        LOG_ERR("%s: failed to load model '%s'\n", __func__, params.model.path.c_str());
        return iparams;
    }
[...]

To allow load cancellation, llama_model_load_from_file needs a progress_callback, but common_init_from_params only takes common_params, and common_model_params_to_llama always builds the llama_model_params object with progress_callback set to null.
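For context, the core llama API already supports cancellation through this callback: if progress_callback returns false, the load is aborted immediately. A minimal sketch of using the core API directly (the atomic cancel flag and the model path are illustrative, not part of this PR):

#include <atomic>
#include "llama.h"

// Illustrative cancel flag; a real application would set this from another
// thread, e.g. a UI "Cancel" button.
static std::atomic<bool> g_cancel_load{false};

int main() {
    llama_model_params mparams = llama_model_default_params();

    // The core API invokes this with a progress value in [0.0, 1.0].
    // Returning false aborts the load; returning true continues.
    mparams.progress_callback = [](float /*progress*/, void * /*user_data*/) -> bool {
        return !g_cancel_load.load();
    };
    mparams.progress_callback_user_data = nullptr;

    // Hypothetical model path.
    llama_model * model = llama_model_load_from_file("model.gguf", mparams);
    if (model == nullptr) {
        // Load failed, or was cancelled via the callback.
        return 1;
    }

    llama_model_free(model);
    return 0;
}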

This PR simply provides a way to set it from outside. I called it load_progress_callback because the callback is used during model loading. Users will still need to call llama_set_abort_callback for compute cancellation, but that can be done after calling common_init_from_params(); see the sketch below.
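A sketch of the intended usage after this change. The field names load_progress_callback and load_progress_callback_user_data follow the naming described above, and the smart-pointer members of common_init_result are assumptions based on current master:

#include <atomic>
#include "common.h"
#include "llama.h"

static std::atomic<bool> g_cancel{false}; // illustrative; set from another thread

int main() {
    common_params params;
    params.model.path = "model.gguf"; // hypothetical path

    // New with this PR: forwarded into llama_model_params.progress_callback,
    // so returning false here cancels the model load.
    params.load_progress_callback = [](float /*progress*/, void * /*user_data*/) -> bool {
        return !g_cancel.load();
    };
    params.load_progress_callback_user_data = nullptr;

    common_init_result res = common_init_from_params(params);
    if (res.model == nullptr) {
        return 1; // load failed or was cancelled
    }

    // Compute cancellation is still configured separately, after init:
    // the abort callback returns true to abort the current computation.
    llama_set_abort_callback(res.context.get(), [](void * /*data*/) -> bool {
        return g_cancel.load();
    }, nullptr);

    return 0;
}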

@ggerganov added the merge ready label ("indicates that this may be ready to merge soon and is just holding out in case of objections") May 19, 2025
Co-authored-by: Sigbjørn Skjæret <[email protected]>
@CISC merged commit 1dfbf2c into ggml-org:master May 19, 2025
46 checks passed
infil00p pushed a commit to baseweight/llama.cpp that referenced this pull request May 22, 2025
